27 research outputs found

    A Two-step Statistical Approach for Inferring Network Traffic Demands (Revises Technical Report BUCS-2003-003)

    Accurate knowledge of traffic demands in a communication network enables or enhances a variety of traffic engineering and network management tasks that are of paramount importance for operational networks. Directly measuring a complete set of these demands is prohibitively expensive because of the huge amounts of data that must be collected and the performance impact that such measurements would impose on the regular behavior of the network. As a consequence, we must rely on statistical techniques to produce estimates of actual traffic demands from partial information. The performance of such techniques is, however, limited by their reliance on partial information and by the large amount of computation they incur, which hampers their convergence. In this paper we study a two-step approach for inferring network traffic demands. First, we elaborate and evaluate a modeling approach for generating good starting points to be fed to iterative statistical inference techniques. We call these starting points informed priors, since they are obtained using actual network information such as packet traces and SNMP link counts. Second, we provide a very fast variant of the EM algorithm which extends its computation range, increasing its accuracy and decreasing its dependence on the quality of the starting point. Finally, we evaluate and compare alternative mechanisms for generating starting points, and compare the convergence characteristics of our EM algorithm against a recently proposed Weighted Least Squares approach. Funding: National Science Foundation (ANI-0095988, EIA-0202067, ITR ANI-0205294).

    A Pragmatic Definition of Elephants in Internet Backbone Traffic

    Studies of Internet traffic at the level of network prefixes, fixed length prefixes, TCP flows, AS’s, and WWW traffic have all shown that a very small percentage of the flows carries the largest part of the information. This behavior is commonly referred to as “the elephants and mice phenomenon”. Traffic engineering applications, such as re-routing or load balancing, could exploit this property by treating elephant flows differently. In this context, though, elephants should not only contribute significantly to the overall load, but also exhibit sufficient persistence in time. The challenge is to examine a flow’s bandwidth and classify it as an elephant based on the data collected across all the flows on a link. In this paper, we present a classification scheme based on the definition of a separation threshold that elephants have to exceed. We introduce two single-feature classification schemes, and show that the resulting elephants are highly volatile. We then propose a two-feature classification scheme that incorporates temporal characteristics, and show that this approach is more successful in isolating elephants that exhibit consistency, thus making them more attractive for traffic engineering applications.
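
The contrast between a single-feature and a two-feature scheme can be illustrated with a minimal Python sketch. It is not the paper's method: the threshold rule (a fixed share of the total link load per interval), the persistence rule (at least `min_hits` exceedances within a sliding `window`), and all function and flow names are assumptions made for illustration only.

```python
from typing import Dict, List, Set


def single_feature_elephants(bw: Dict[str, List[float]], t: int, share: float) -> Set[str]:
    """Flows whose bandwidth in interval t exceeds `share` of the total link load.

    Assumed threshold rule, chosen only for illustration.
    """
    total = sum(series[t] for series in bw.values())
    threshold = share * total
    return {flow for flow, series in bw.items() if series[t] > threshold}


def two_feature_elephants(bw: Dict[str, List[float]], t: int, share: float,
                          window: int, min_hits: int) -> Set[str]:
    """Flows above the threshold in at least `min_hits` of the last `window` intervals.

    The second feature (persistence) is an assumed stand-in for the
    temporal characteristics mentioned in the abstract.
    """
    start = max(0, t - window + 1)
    hits = {flow: 0 for flow in bw}
    for u in range(start, t + 1):
        for flow in single_feature_elephants(bw, u, share):
            hits[flow] += 1
    return {flow for flow, n in hits.items() if n >= min_hits}


# Hypothetical per-interval bandwidths (arbitrary units) for three flows.
bw = {
    "prefix_a": [50.0, 55.0, 60.0, 52.0],
    "prefix_b": [5.0, 70.0, 4.0, 6.0],
    "prefix_c": [10.0, 8.0, 9.0, 11.0],
}

volatile = single_feature_elephants(bw, t=1, share=0.3)
persistent = two_feature_elephants(bw, t=3, share=0.3, window=4, min_hits=3)
print(sorted(volatile), sorted(persistent))
```

In this toy data, `prefix_b` bursts above the threshold in a single interval, so it is flagged by the single-feature rule at that instant but not by the persistence-aware rule.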

    HMM-based monitoring of packet channels

    The performance of real-time applications on network communication channels is strongly related to losses and temporal delays. Several studies have shown that these network features may be correlated and exhibit a certain degree of memory, such as bursty losses and delays. The memory and the statistical dependence between losses and temporal delays suggest that the channel may be well modelled by a Hidden Markov Model (HMM) with appropriate hidden variables that capture the current state of the network. In this paper we discuss the effectiveness of using an HMM to jointly model the loss and delay behavior of a real communication channel. Excellent performance in modelling typical channel behavior is observed on a set of real packet links. The system parameters are found via a modified version of the EM algorithm. Hidden state analysis shows how the state variables characterize channel dynamics. State-sequence estimation is obtained by use of the Viterbi algorithm. Real-time modelling of the channel is the first step toward implementing adaptive communication strategies.
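
As a rough illustration of the state-sequence estimation step, the sketch below runs the standard Viterbi algorithm on a toy two-state channel. Everything specific here is an assumption, not taken from the paper: the observations are discretized into three symbols (0 = delivered with low delay, 1 = delivered with high delay, 2 = lost), the parameters are invented rather than fitted with EM, and all probabilities are assumed strictly positive so that logarithms are defined.

```python
import math
from typing import List


def viterbi(obs: List[int], start: List[float],
            trans: List[List[float]], emit: List[List[float]]) -> List[int]:
    """Most likely hidden state sequence for `obs`, computed in the log domain."""
    n_states = len(start)
    score = [math.log(start[s]) + math.log(emit[s][obs[0]]) for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev = score
        score, ptr = [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + math.log(trans[r][s]))
            ptr.append(best)
            score.append(prev[best] + math.log(trans[best][s]) + math.log(emit[s][o]))
        back.append(ptr)
    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical parameters: state 0 = "good", state 1 = "congested".
start = [0.9, 0.1]
trans = [[0.9, 0.1],
         [0.2, 0.8]]
emit = [[0.90, 0.08, 0.02],   # good: mostly low-delay deliveries
        [0.20, 0.40, 0.40]]   # congested: frequent high delay and loss

obs = [0, 0, 2, 2, 1, 0, 0]
path = viterbi(obs, start, trans, emit)
print(path)
```

With these invented parameters, the decoded path switches to the congested state for the run of losses and high delay, then returns to the good state.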

    Identifiability of flow distributions from link measurements with applications to computer networks

    We study the problem of identifiability of distributions of flows on a graph from aggregate measurements collected on its edges. This is a canonical example of a statistical inverse problem motivated by recent developments in computer networks. In this paper we (i) introduce a number of models for multi-modal data that capture their spatio-temporal correlation, and (ii) provide sufficient conditions for the identifiability of nth order cumulants and also for a special class of heavy tailed distributions. Further, we investigate conditions on network routing for the flows that prove sufficient for identifiability of their distributions (up to mean). Finally, we extend our results to directed acyclic graphs and discuss some open problems. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/58107/2/ip7_5_004.pd
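
For orientation, the kind of inverse problem described here is often written as a linear measurement model. The notation below is an assumption for illustration (a 0/1 routing matrix $A$, mutually independent flows $X_j$) and is not taken from the paper, whose models also cover correlated data.

```latex
% Link measurements Y are aggregates of unobserved flows X.
% A is an L x J routing matrix with a_{lj} = 1 if flow j crosses edge l.
Y = A X, \qquad Y \in \mathbb{R}^{L},\; X \in \mathbb{R}^{J},\; A \in \{0,1\}^{L \times J}

% First-order information: with L < J this system is underdetermined,
% so link means alone do not pin down the flow means.
\mathbb{E}[Y] = A\, \mathbb{E}[X]

% Under the independence assumption, cumulants of order n >= 2 are additive,
% and a_{lj}^n = a_{lj} for 0/1 entries:
\kappa_n(Y_l) = \sum_{j=1}^{J} a_{lj}^{\,n}\, \kappa_n(X_j) = \sum_{j=1}^{J} a_{lj}\, \kappa_n(X_j)
```

The first-order system illustrates why statements of this kind are typically made "up to mean": the means satisfy only $L$ linear equations in $J$ unknowns.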

    Automated detection of regions of interest for tissue microarray experiments: an image texture analysis

    BACKGROUND: Recent research with tissue microarrays has led to rapid progress toward quantifying the expression of large sets of biomarkers in normal and diseased tissue. However, standard procedures for sampling tissue for molecular profiling have not yet been established. METHODS: This study presents a high throughput analysis of texture heterogeneity on breast tissue images for the purpose of identifying regions of interest in the tissue for molecular profiling via tissue microarray technology. Image texture of breast histology slides was described in terms of three parameters: the percentage of area occupied in an image block by chromatin (B), the percentage occupied by stroma-like regions (P), and a statistical heterogeneity index H commonly used in image analysis. Texture parameters were defined and computed for each of the thousands of image blocks in our dataset using both gray scale and color segmentation. The image blocks were then classified into three categories using the texture feature parameters in a novel statistical learning algorithm. These categories are as follows: image blocks specific to normal breast tissue, blocks specific to cancerous tissue, and image blocks that are non-specific to normal and disease states. RESULTS: Gray scale and color segmentation techniques led to identification of the same regions in histology slides as cancer-specific. Moreover, the image blocks identified as cancer-specific belonged to those cell-crowded regions in whole section image slides that were marked by two pathologists as regions of interest for further histological studies. CONCLUSION: These results indicate the high efficiency of our automated method for identifying pathologic regions of interest on histology slides. Automation of critical region identification will help minimize inter-rater variability among pathologists, since the hundreds of tumors used to develop an array have typically been evaluated (graded) by different pathologists. The region of interest information gathered from the whole section images will guide the excision of tissue for constructing tissue microarrays and for high throughput profiling of global gene expression.

    Railway bridge structural health monitoring and fault detection: state-of-the-art methods and future challenges

    The importance of railways in the transportation industry is increasing continuously, due to the growing demand for both passenger travel and transportation of goods. However, more than 35% of the 300,000 railway bridges across Europe are over 100 years old, and their reliability directly impacts the reliability of the railway network. This increased demand may lead to higher risk associated with their unexpected failures, resulting in safety hazards to passengers and increased whole life cycle cost of the asset. Consequently, one of the most important aspects of evaluating the reliability of the overall railway transport system is bridge structural health monitoring, which can monitor the health state of the bridge by allowing an early detection of failures. As a result, a fast, safe and cost-effective recovery of the optimal health state of the bridge, where the levels of element degradation or failure are maintained efficiently, can be achieved. In this article, after an introduction to the desired features of structural health monitoring, a review of the most commonly adopted bridge fault detection methods is presented. The analysis focuses mainly on model-based finite element updating strategies, non-model-based (data-driven) fault detection methods, such as artificial neural networks, and Bayesian belief network-based structural health monitoring methods. A comparative study, which aims to discuss and compare the performance of the reviewed types of structural health monitoring methods, is then presented by analysing a short-span steel structure of a railway bridge. Opportunities and future challenges of the fault detection methods of railway bridges are highlighted.

    Hidden Markov modeling for network communication channels.

    In this paper we perform a statistical analysis of an Internet communication channel. Our study is based on a Hidden Markov Model (HMM). The channel switches between different states; each state has an associated probability that a packet sent by the transmitter will be lost. The transition between the different states of the channel is governed by a Markov chain; this Markov chain is not observed directly, but the received packet flow provides some probabilistic information about the current state of the channel, as well as some information about the parameters of the model. In this paper we detail some useful algorithms for the estimation of the channel parameters, and for making inference about the state of the channel. We discuss the relevance of the Markov model of the channel; we also discuss how many states are required to adequately model a real communication channel.
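
One common way to make inference about the current channel state from the received packet flow is forward filtering. The sketch below is a generic illustration of that technique, not the algorithms detailed in the paper; the two-state model, its transition matrix, and the per-state loss probabilities are all invented for the example.

```python
from typing import List


def filter_states(obs: List[int], start: List[float],
                  trans: List[List[float]], loss_prob: List[float]) -> List[List[float]]:
    """Return P(state at t | packets 0..t) for each t.

    obs[t] is 1 if packet t was lost and 0 if it was received.
    loss_prob[s] is the loss probability attached to hidden state s.
    """
    n = len(start)
    belief = list(start)
    history = []
    for t, lost in enumerate(obs):
        if t > 0:
            # Prediction step: push the belief through the Markov chain.
            belief = [sum(belief[r] * trans[r][s] for r in range(n)) for s in range(n)]
        # Update step: weight each state by the likelihood of the observation.
        like = [loss_prob[s] if lost else 1.0 - loss_prob[s] for s in range(n)]
        unnorm = [belief[s] * like[s] for s in range(n)]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
        history.append(belief)
    return history


# Hypothetical channel: state 0 rarely loses packets, state 1 loses 30% of them.
start = [0.5, 0.5]
trans = [[0.95, 0.05],
         [0.10, 0.90]]
loss_prob = [0.01, 0.30]

beliefs = filter_states([0, 0, 1, 1, 1], start, trans, loss_prob)
for b in beliefs:
    print([round(p, 3) for p in b])
```

In this toy run, two received packets shift the belief toward the low-loss state, and the subsequent burst of losses moves it sharply toward the high-loss state.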
